Introduction:

The completion of the Human Genome Project in 2003 established functional genomics as a
new challenge in understanding humans at the molecular level. Efficient interpretation of the
functions of human genes requires resources and strategies that enable large-scale
investigations across entire genomes. Recent advances in the field of RNA interference have
enabled researchers to conduct large-scale loss-of-function studies in mammals and to address
the challenges presented by functional genomics.

Most  eukaryotic  cells  harbor  a  natural  response  to  double-stranded  RNAs  (dsRNA)  that 
inhibits gene expression in a sequence-specific manner1. DsRNA silencing triggers are processed
into  small  RNAs  (siRNAs  and  miRNAs)  that  engage  the  RNA-induced  silencing  complex 
(RISC)  to  suppress  expression  of  homologous  targets.  In  cases  in  which  the  small  RNA  is 
perfectly  complementary  to  the  target,  that  RNA  is  cleaved  and  ultimately  destroyed1,2.  This 
pathway,  known  as  RNA  interference  (RNAi),  has  been  exploited  in  organisms  ranging  from 
plants  to  fungi  to  animals  for  deciphering  gene  function  through  suppression  of  gene  expression. 
Particularly  in  systems  where  targeted  genetic  manipulation  is  difficult  or  time  consuming,  RNAi 
has transformed the way in which gene function can be approached on a single gene or
genome-wide level.

In mammals, RNAi can be initiated in several ways. The most prevalent method of
triggering RNAi is the delivery of one or more small interfering RNAs (siRNAs). SiRNAs are
duplexes of ~21-22 nucleotides that bear two-nucleotide 3’ overhangs1,8. One strand (the guide
strand) of the siRNA is incorporated into the effector complex of RNAi, the RNA-induced
silencing complex (RISC), and guides substrate selection via base pairing to its complementary
target. The RNAi machinery can also be programmed by endogenous sources of double-stranded
RNA. The best-characterized endogenous triggers for the RNAi
machinery are the microRNA genes9,10. Numerous studies have demonstrated that, in animals,
miRNAs are transcribed to generate long primary polyadenylylated RNAs (pri-miRNAs)11,12.
Through mechanisms not yet fully understood, the pri-microRNA is recognized and cleaved at a
specific site by the nuclear Microprocessor complex13-17 to produce a ~70-90 nucleotide
microRNA precursor (pre-miRNA), which is exported to the cytoplasm18,19. Only then is the
pre-miRNA recognized by Dicer and cleaved to produce a mature microRNA. This probably
involves recognition of the 2 nucleotide 3’ overhang created by Drosha to focus Dicer cleavage
at a single site ~22 nucleotides from the end of the hairpin20.

Previously,  several  groups,  including  our  own,  described  the  design  and  construction  of 
arrayed short hairpin RNA (shRNA) libraries that covered a fraction (~1/3) of human genes21,22.
At  the  time  when  these  tools  were  developed,  our  knowledge  of  microRNA  maturation  was 
relatively  incomplete.  This  led  most  groups  to  the  notion  of  expressing  a  simple  hairpin  RNA 
that mimicked the pre-miRNA. As our knowledge of the microRNA processing pathway and our
understanding  of  strand  preferences  for  RISC  loading  have  grown,  it  seemed  prudent  to 
reevaluate  whether  the  performance  of  encoded  triggers  of  the  RNAi  pathway  might  be 
improved  by  remodeling  a  primary  miRNA  transcript  to  experimentally  alter  its  targeting 
capability.  Indeed  such  strategies  have  previously  succeeded  in  both  plants  and  animals23,24. 

My  initial  studies  focused  on  the  biology  of  miRNA  processing  which  guided  the  new 
design  of  the  Hannon-Elledge  shRNA  library.  Concurrently,  we  have  been  testing  shRNAs  and 
developing  screens  that  would  allow  us  to  target  genes  involved  in  apoptosis  and  growth  arrest. 
Our approach is to study synthetic lethal interactions with p53 in cancer cells. This tumor
suppressor protein induces apoptotic cell death in response to oncogenic stress. Loss of p53
function, caused either by a mutation in the gene that encodes p53 or by defects in the signaling
pathways upstream or downstream of p53, leads to malignant progression25. The ultimate goal is
to use RNAi as a genetic tool to find molecular vulnerabilities unique to breast cancer cells.
These  vulnerabilities  are  potential  chemotherapeutic  targets  that  can  be  exploited  to  kill  cancer 
cells. 
Body:

The biological studies described in my last update, together with our lab’s ability to synthesize
complex oligonucleotide populations using in situ microarray DNA synthesis, contributed to the
development of the new Hannon-Elledge library, which will enable more effective genetic
loss-of-function studies in mammalian systems. The result of this work is attached in the appendix,
entitled “Second-Generation shRNA Libraries Covering the Mouse and Human Genomes.” In addition, I
am currently conducting a screen for genes that are synthetic
lethal  with  p53.  p53  inhibits  tumor  cell  growth  by  evoking  several  responses  to  malignancy- 
associated  stress  signals  including  cell-cycle  arrest,  senescence,  and  apoptosis,  with  the  option 
chosen  being  dependent  on  many  factors  that  are  both  intrinsic  and  extrinsic  to  the  cell25.  p53 
can  also  contribute  to  the  repair  of  genotoxic  damage,  potentially  allowing  for  the  release  of 
damaged  cells  back  into  the  proliferating  pool.  Mutations  in  the  p53  gene  occur  in  about  half  of 
all  human  cancers,  almost  always  resulting  in  the  expression  of  a  mutant  p53  protein  that  has 
acquired  transforming  activity25. 

Synthetic  lethality  allows  us  to  functionally  define  vulnerabilities  in  cancer  cells  because 
cancer  arises  from  genetic  lesions  in  somatic  cells.  Synthetic  lethal  interactions  occur  when 
mutations in two or more nonallelic genes synergize to kill cells. For example, a particular
mutation may be tolerated when present alone in cells, but may cause cell death when combined
with a second mutation. Thus, synthetic lethal interactions reveal situations in which cellular
homeostasis is altered
by  a  molecular  lesion  so  that  the  action  of  another  gene  or  pathway  is  required  to  compensate26. 
The  fact  that  cancer  cells  arise  from  genetic  alterations  makes  synthetic  lethality  ideally  suited 
for  identifying  cellular  targets  required  by  cancer  cells  for  viability. 

For our initial screen, we have selected two cell lines that are as genotypically identical
as possible except for their p53 status. We are using HCT116 and HCT116 p53-null colon cancer
cell  lines.  While  these  cell  lines  are  not  breast  cancer  cell  lines,  they  can  serve  as  a  basis  for  a 
more  developed  screen  in  MCF7  or  MCF10A  cell  lines.  We  can  work  out  the  conditions  for  our 
screen  in  these  genotypically  and  phenotypically  well  characterized  cell  lines  and  then  expand 
into  breast  cancer  cells.  This  would  serve  to  verify  our  initial  results  and  also  expose  genes  that 
are  responsible  for  apoptosis  in  combination  with  p53  loss  across  multiple  cancers.  We  can  also 
compare genes that would cause the p53-null cancer cells to die versus cancer cells with
wild-type p53.

My screen began with transfecting Phoenix packaging cells with our new retroviral
shRNA  library  constructed  in  a  new  format  as  described  in  the  appendix.  Four  shRNA  library 
subsets  were  used  targeting  human  kinases,  dual  specificity  phosphatases,  protein  tyrosine 
phosphatases,  and  a  c600  control  set  that  contains  hairpins  for  proteasome  subunits,  cell 
proliferation  genes  and  barcodes.  Protein  kinases  are  critical  components  of  cellular  signaling 
cascades  that  control  cell  proliferation  and  other  responses  to  external  stimuli.  Kinases  are 
attractive  drug  targets  as  their  dysfunction  can  result  in  cancer.  Viruses  were  pooled  and  both 
cell  lines  were  infected  separately  such  that  each  cell  is  targeted  to  carry  on  average  one  copy  of 
the  hairpin  expression  cassette.  There  are  several  advantages  to  this  approach  over  transiently 
transfected  screens:  The  knockdown  effects  can  be  monitored  over  extended  periods,  shRNA 
expression  is  more  normalized,  thereby  facilitating  the  screening  of  cells  in  pools,  and  finally, 
this  approach  is  very  adaptable  for  high  throughput  studies27. 
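
The target of roughly one cassette per cell can be made concrete with a short calculation. The sketch below assumes, as a simplification that is not stated in this report, that the number of integrations per cell follows a Poisson distribution whose mean is the average copy number. The function names are illustrative only.

```python
import math

def copy_number_distribution(mean_copies, max_copies=3):
    """Poisson probability of a cell carrying 0..max_copies cassettes.

    Assumption: integrations per cell are Poisson distributed.
    """
    return [
        math.exp(-mean_copies) * mean_copies ** k / math.factorial(k)
        for k in range(max_copies + 1)
    ]

def single_copy_fraction_of_infected(mean_copies):
    """Fraction of infected cells (at least one copy) that carry exactly one copy."""
    p_zero = math.exp(-mean_copies)
    p_one = mean_copies * math.exp(-mean_copies)
    return p_one / (1.0 - p_zero)

for mean_copies in (1.0, 0.3):
    infected = 1.0 - copy_number_distribution(mean_copies)[0]
    single = single_copy_fraction_of_infected(mean_copies)
    print(f"mean={mean_copies}: infected={infected:.2f}, single-copy among infected={single:.2f}")
```

Under this assumption, an average of one copy per cell leaves about 58% of infected cells with exactly one cassette, while lowering the average to 0.3 raises that share to about 86% at the cost of infecting fewer cells.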

Cell viability can be assayed in two ways. One is to use a fluorescent MTT dye
reduction  assay.  Alternatively,  genomic  DNA  will  be  extracted  from  cells  at  0,  5,  10  and  15  days 
after  hairpin  selection  with  puromycin.  Cells  will  be  trypsinized,  pooled  and  replated  allowing 
them  to  grow  to  near  confluency.  This  will  entail  a  negative  selection  method  where  the  fate  of 
individual  shRNAs  in  a  complex  population  will  be  monitored  by  adopting  a  DNA-barcoding 
strategy.  Each  hairpin  is  linked  to  a  60mer  DNA  barcode  that  allows  us  to  virtually  count  the 
number  of  cells  that  contain  a  specific  shRNA  cassette  by  looking  at  barcode  representation  in  a 
cell  population  on  a  microarray.  Genomic  DNA  extracted  from  each  sample  will  be  assayed  for 
the  presence  of  DNA  barcodes  by  PCR.  This  DNA  will  be  amplified  using  a  primer  containing  a 
T7  RNA  polymerase  promoter  sequence  that  allows  for  in  vitro  transcription  of  fluorescently 
labeled single-stranded RNA that will be subsequently hybridized to a custom microarray
containing  complements  of  these  sequences.  This  technology  will  illustrate  cell  death  as  a  loss  of 
barcode  representation  in  a  population  of  cells.  Comparison  between  the  hybridization  patterns 
of  the  different  DNA  samples  over  time  allows  the  identification  of  shRNAs  that  are 
synthetically  lethal  with  p53  and  thus  cause  apoptosis  or  cell  growth  arrest. 
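
The comparison of barcode representation over time can be sketched as a log-ratio calculation. The example below is only an illustration under stated assumptions: the barcode names, signal intensities and both cutoffs are made up, and the report does not specify how hybridization signals will be normalized or scored.

```python
import math

def log2_fold_change(late, early, pseudocount=1.0):
    """Log2 ratio of late to early barcode signal; the pseudocount avoids log(0)."""
    return math.log2((late + pseudocount) / (early + pseudocount))

def synthetic_lethal_candidates(null_signals, wt_signals,
                                depleted_below=-1.0, unchanged_above=-0.5):
    """Return barcodes depleted in p53-null cells but retained in wild-type cells.

    Each argument maps a barcode ID to (day 0 signal, day 15 signal).
    Both cutoffs are arbitrary illustrative values, not taken from the report.
    """
    hits = []
    for barcode, (null_early, null_late) in null_signals.items():
        wt_early, wt_late = wt_signals[barcode]
        null_lfc = log2_fold_change(null_late, null_early)
        wt_lfc = log2_fold_change(wt_late, wt_early)
        if null_lfc <= depleted_below and wt_lfc >= unchanged_above:
            hits.append(barcode)
    return sorted(hits)

# Made-up intensities for three hypothetical barcodes.
null_signals = {"bc_A": (1000, 200), "bc_B": (1000, 950), "bc_C": (1000, 150)}
wt_signals = {"bc_A": (1000, 900), "bc_B": (1000, 1000), "bc_C": (1000, 100)}
print(synthetic_lethal_candidates(null_signals, wt_signals))  # ['bc_A']
```

In this toy data set, bc_A is lost only from the p53-null population and is flagged, bc_B is stable in both lines, and bc_C is lost from both lines, which would suggest a generally essential gene and not a p53-specific interaction.
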
Key Research Accomplishments:

•  Assisting  in  the  construction  and  validation  of  second-generation  shRNA  (shRNAmir) 
expression  libraries  that  have  been  designed  based  on  an  increased  knowledge  of  RNAi 
biochemistry.  We  have  generated  large-scale  arrayed,  sequence-verified  libraries 
comprising  more  than  140,000  shRNAmir  expression  plasmids,  covering  a  substantial 
fraction  of  all  predicted  genes  in  the  human  and  mouse  genomes. 

•  The  initiation  of  a  p53  synthetic  lethal  screen  in  cancer  cells  that  can  identify  genes  that  are 
involved  in  apoptosis  and  growth  arrest. 

Reportable  Outcomes: 

Manuscripts: 

Jose M. Silva, Mamie Z. Li, Ken Chang, Wei Ge, Michael C. Golding, Ricky Rickles, Despina
Siolas, Guang Hu, Patrick J. Paddison, Michael R. Schlabach, Nihar Sheth, Jeff Bradshaw, Julia
Burchard, Amit Kulkarni, Guy Cavet, Ravi Sachidanandam, W. Richard McCombie, Michele A.
Cleary,  Stephen  J.  Elledge,  and  Gregory  J.  Hannon.  Second-Generation  shRNA  Libraries 
Covering  the  Mouse  and  Human  Genomes,  (submitted:  Nature  Genetics  June  2005) 

Siolas,  D.,  Lerner  C.,  Burchard  J.,  Ge  W.,  Linsley  PS.,  Paddison  PJ.,  Hannon  GJ.,  Cleary  MA. 
(2005)  Synthetic  shRNAs  as  potent  RNAi  triggers.  Nature  Biotechnology.  23(2):227-31 

Presentations: 

Minisymposium  Talk: 

Siolas,  D.,  Lerner  C.,  Burchard  J.,  Ge  W.,  Linsley  PS.,  Hannon  GJ.,  Cleary  MA.  (2005) 

Synthetic  shRNAs  as  potent  RNAi  triggers.  American  Association  for  Cancer  Research 
Conference  Minisymposium  Presentation,  Anaheim,  California,  USA 

Poster: 

Siolas,  D.,  Lerner  C.,  Burchard  J.,  Ge  W.,  Linsley  PS.,  Hannon  GJ.,  Cleary  MA.  (2005) 

Synthetic  shRNAs  as  potent  RNAi  triggers.  Era  of  Hope,  Department  of  Defense  Breast  Cancer 
Research  Program,  Philadelphia,  PA  USA 
Awards:

Siolas,  D.  (2005)  AACR-AstraZeneca  Scholar-in-Training  Award.  American  Association  for 
Cancer  Research,  Philadelphia,  PA,  USA. 


Conclusions: 

The  enhancement  of  our  chemotherapeutic  arsenal  is  crucial  in  our  battle  to  conquer  cancer. 
The  shRNAmir  libraries  that  I’ve  helped  develop  provide  a  convenient,  flexible  and  effective  tool 
for  studying  gene  function  in  human  cells.  Our  new  knowledge  of  hairpin  processing  enabled  us  to 
enhance  the  silencing  capabilities  of  our  hairpins  by  applying  siRNA  silencing  guidelines  to  them. 

In  this  process,  we  were  able  to  make  more  potent  RNAi  triggers  that  unexpectedly  worked  more 
efficiently  than  siRNAs  with  identical  sequences.  Libraries  such  as  this  allow  the  use  of  RNAi  as  a 
genetic  tool  to  study  cancer  genes  and  identify  the  molecular  pathways  these  genes  affect. 

I  am  currently  using  this  library  to  study  genetic  vulnerabilities  in  cancer  cells  through 
screening for synthetic lethal combinations with p53 loss. The screen has been started in HCT116
cells that have either wild-type or no p53. This will allow us to work out the conditions of the screen
and expand into MCF7 and other breast cancer cell lines. Using shRNA libraries in cancer studies
will expand our biological understanding of cancer and allow us to enhance existing cancer
therapies and develop new ones.
Loss-of-function phenotypes often hold the key to understanding the
connectivity  and  biological  functions  of  biochemical  pathways.  We  and 
others  have  previously  constructed  libraries  of  short  hairpin  RNAs 
(shRNAs)  that  allow  systematic  analysis  of  RNAi-induced  phenotypes  in 
mammalian cells 1,2. Here we report the construction and validation of
second-generation shRNA (shRNAmir) expression libraries that have been
designed  based  on  an  increased  knowledge  of  RNAi  biochemistry.  In  these 
constructs,  silencing  triggers  have  been  designed  to  mimic  a  natural 
microRNA  primary  transcript,  and  each  target  sequence  has  been  selected 
based  on  thermodynamic  criteria  for  optimal  small  RNA  performance. 
Biochemical  and  phenotypic  assays  have  indicated  that  the  new  libraries 
are  substantially  improved  compared  to  first-generation  reagents.  We  have 
generated  large-scale  arrayed,  sequence-verified  libraries  comprising  more 
than  140,000  shRNAmir  expression  plasmids,  covering  a  substantial  fraction 
of  all  predicted  genes  in  the  human  and  mouse  genomes.  These  libraries 
are  presently  available  to  the  scientific  community. 


Introduction


Most  eukaryotic  cells  harbor  a  natural  response  to  double-stranded  RNAs 
(dsRNA)  that  inhibits  gene  expression  in  a  sequence-specific  manner 3.  DsRNA 
silencing  triggers  are  processed  into  small  RNAs  (siRNAs  and  miRNAs)  that 
engage  the  RNA-induced  silencing  complex  (RISC)  to  suppress  expression  of 
homologous  targets.  In  cases  in  which  the  small  RNA  is  perfectly  complementary 
to the target, that RNA is cleaved and ultimately destroyed 3,4. This pathway,
known  as  RNA  interference  (RNAi),  has  been  exploited  in  organisms  ranging 
from  plants  to  fungi  to  animals  for  deciphering  gene  function  through  suppression 
of  gene  expression.  Particularly  in  systems  where  targeted  genetic  manipulation 
is  difficult  or  time  consuming,  RNAi  has  transformed  the  way  in  which  gene 
function can be approached on a single gene or genome-wide level 5-9.

In  mammals,  RNAi  can  be  initiated  in  several  ways.  First,  RNA  molecules 
can be produced chemically 10 or enzymatically in vitro 11-14 and delivered to a
cell. The most prevalent method of triggering RNAi is the delivery of one or more
small interfering RNAs (siRNAs). SiRNAs are duplexes of ~21-22 nucleotides
that bear two-nucleotide 3’ overhangs 3,15. One strand (the guide strand) of the
siRNA is incorporated into the effector complex of RNAi, the RNA-induced
silencing complex (RISC), through the action of a RISC loading complex (RLC).
Once in RISC, the siRNA guides substrate selection via base pairing to its
complementary target 3,4.

At  the  heart  of  RISC  is  an  Argonaute  protein  16,  which  directly  contacts  the 
siRNA  17.  When  the  mRNA  is  engaged  by  this  complex,  the  siRNA-mRNA 
interaction  places  the  target  in  the  correct  alignment  with  the  nuclease  active  site 
or  “slicer”  within  the  Argonaute  PIWI  domain,  and  the  target  is  endonucleolytically 


The RNAi machinery can also be programmed by endogenous sources of
double-stranded RNA. The best-characterized endogenous
triggers for the RNAi machinery are the microRNA genes 21,22. It was initially
assumed that microRNAs were transcribed from the genome as short hairpin
RNAs 23 that were directly processed by Dicer to yield the mature small RNAs
that enter RISC 24-27. Over the past year, however, a different picture has
emerged. Numerous studies have now demonstrated that, in animals, miRNAs
are transcribed by RNA polymerase II to generate long primary polyadenylylated
RNAs (pri-miRNAs) 28,29. These primary transcripts probably adopt a complex
secondary structure and fold, in the areas that harbor the mature microRNA
sequences, into double-stranded RNA hairpins. Through mechanisms not yet
fully understood, the pri-microRNA is recognized and cleaved at a specific site by
the nuclear Microprocessor complex 30-34. This contains an RNase III family
enzyme, Drosha, that cleaves the hairpin to produce a ~70-90 nucleotide
microRNA precursor (pre-miRNA) with a 2 nucleotide 3’ overhang 30. This
distinctive structure signals transport of the pre-miRNA to the cytoplasm by a
mechanism mediated by Exportin-5 35,36. Only then is the pre-miRNA recognized
by Dicer and cleaved to produce a mature microRNA. This probably involves
recognition of the 2 nucleotide 3’ overhang created by Drosha to focus Dicer
cleavage at a single site ~22 nucleotides from the end of the hairpin 17,37.

Mature  miRNAs  are  superficially  symmetrical,  with  2  nucleotide  3’ 
overhangs  at  each  end  having  been  generated  by  Drosha  and  Dicer, 
respectively.  However,  the  individual  strands  of  the  mature  miRNA  enter  RISC  in 
an  unequal  manner.  As  with  siRNAs,  the  thermodynamic  asymmetry  of  the  Dicer 
product  is  sensed  such  that  the  strand  with  the  less  stable  5’  end  has  a  greater 
propensity  to  enter  RISC  and  guide  substrate  selection  38,39.  This  observation  of 
thermodynamic  asymmetry  within  small  RNAs  led  to  the  development  of  rules  for 
predicting  effective  siRNA  sequences  that  have  greatly  improved  the  efficiency  of 
those  RNAs  as  genetic  tools. 
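
The asymmetry rule described above can be illustrated with a deliberately crude sketch. It assumes that the A/U content of the terminal base pairs is an adequate proxy for end stability, which is a simplification: published design rules rely on thermodynamic stability estimates, and nothing below reproduces the selection criteria used for the library. The function names and the four-base-pair window are hypothetical.

```python
def au_count(segment):
    """Count weakly pairing bases (A, U, or T if a DNA alphabet is given)."""
    return sum(base in "AUT" for base in segment.upper())

def preferred_guide_strand(sense_core, window=4):
    """Guess which strand of a small RNA duplex is favored for RISC entry.

    sense_core is the base-paired region of the sense strand, written 5' to 3'.
    The first `window` pairs hold the 5' end of the sense strand; the last
    `window` pairs hold the 5' end of the antisense strand. The strand whose
    5' end sits in the A/U-richer (less stable) terminus is reported.
    """
    sense_end = au_count(sense_core[:window])
    antisense_end = au_count(sense_core[-window:])
    if antisense_end > sense_end:
        return "antisense"
    if sense_end > antisense_end:
        return "sense"
    return "no preference"

# Hypothetical 19-nucleotide core with a G/C-rich start and an A/U-rich end.
print(preferred_guide_strand("GCGCCAUGGCAUCAUUAUA"))  # antisense
```

For a silencing trigger, the antisense strand is the one complementary to the target mRNA, so a sequence like the example, with a weakly paired end at the antisense 5' terminus, is the favorable case under this rule.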

Previously,  several  groups,  including  our  own,  described  the  design  and 
construction  of  arrayed  short  hairpin  RNA  (shRNA)  libraries  that  covered  a 
fraction (~1/3) of human genes. At the time when these tools were developed,
our  knowledge  of  microRNA  maturation  was  relatively  incomplete.  This  led  most 
groups  to  the  notion  of  expressing  a  simple  hairpin  RNA  that  mimicked  the  pre- 
miRNA.  As  our  knowledge  of  the  microRNA  processing  pathway  and  our 
understanding  of  strand  preferences  for  RISC  loading  have  grown,  it  seemed 
prudent  to  reevaluate  whether  the  performance  of  encoded  triggers  of  the  RNAi 
pathway  might  be  improved  by  remodeling  a  primary  miRNA  transcript  to 
experimentally  alter  its  targeting  capability.  Indeed  such  strategies  have 
previously succeeded in both plants and animals 40,41.

Here  we  report  the  construction  of  a  new  generation  of  shRNA  libraries 
(shRNAmir) that takes into consideration our advancing understanding of
microRNA  biogenesis.  In  these  constructs,  the  shRNA  is  harbored  within  the 
backbone  of  the  primary  mir-30  microRNA.  This  natural  configuration  proved  to 
be  up  to  12  times  more  efficient  in  the  production  of  the  mature  synthetic  miRNAs 
than  the  previous  design.  Additionally,  we  have  biochemically  characterized 
processing  of  these  synthetic  microRNAs,  allowing  us  to  predict  the  mature  small 
RNA  product(s)  that  will  be  generated  from  each  vector.  This  has  allowed 
selection  of  target  sequences  that  maximize  efficiency  by  directing  preferential 
incorporation  of  the  correct  strand  into  RISC.  Using  these  criteria,  we  have 
produced  and  sequence-verified  more  than  140,000  shRNAs  covering  a 
substantial  fraction  of  the  predicted  genes  in  the  mouse  and  human  genomes. 

We  have  assayed  a  selected  subset  of  shRNAs  from  the  library  for  their  ability  to 
knock-down  the  expression  of  targeted  genes  by  quantitative  RT-PCR.  We  have 
also  tested  this  set  in  a  phenotypic  assay  and  compared  the  performance  of  the 
first-  and  second-generation  library  designs.  Overall,  the  shRNAmir  libraries  that 
we  describe  here  provide  a  convenient,  flexible  and  effective  tool  for  studying 
gene  function  in  human  cells.  Additionally,  they,  for  the  first  time,  extend  the 
possibility  of  large-scale  RNAi  screens  to  mouse  systems. 
Results


Design  and  construction  of  second  generation  shRNA  libraries 

We  have  previously  shown  that  expression  of  a  simple,  29  basepair  (bp) 
hairpin  from  a  U6  snRNA  promoter  can  induce  effective  suppression  of  target 
genes  when  delivered  either  transiently  or  stably  from  integrated  constructs 
1,42,43. We also found that longer hairpin structures were more effective inhibitors
of  gene  expression  than  were  shorter  structures  with  stems  of  19-21  nucleotides. 
All  of  these  constructs,  however,  were  designed  to  express  a  pre-miRNA  hairpin, 
an  intermediate  in  microRNA  biogenesis,  rather  than  a  transcript  that  closely 
resembles  a  primary  microRNA.  Cullen  and  colleagues  had  previously  shown 
that  effective  suppression  could  be  achieved  by  redesigning  an  endogenous 
microRNA,  miR-30,  such  that  its  targeting  sequence  was  directed  against  a 
reporter  gene  40.  We  sought  to  compare  directly  the  abundance  of  small  RNAs 
produced  from  vectors  with  simple  hairpin  structures  to  those  that  more  closely 
resemble  a  natural  microRNA.  Since  it  had  been  previously  shown  that  the 
efficient  ectopic  expression  of  endogenous  microRNAs  requires  substantial 
flanking  sequence  44,  we  developed  a  vector  in  which  sequences  from  a 
remodeled miR30 are flanked by ~125 bases of 5’ and 3’ sequence derived from
the  primary  transcript.  Incorporation  of  appropriate  cloning  sites  into  this  vector 
required  altering  only  3  positions  in  the  precursor.  This  cassette  was  inserted 
into  a  vector  equivalent  to  that  in  which  we  constructed  our  first-generation 
shRNA library (pSM1), with the new shRNA vector being designated pSM2. To
distinguish  the  second-generation  shRNAs  from  those  in  our  first-generation 
library,  we  have  dubbed  these  shRNAmir. 

In  order  to  enable  the  use  of  small  RNA  design  rules  to  potentially 
enhance  the  efficacy  of  our  shRNAs,  it  was  necessary  to  understand  how  the 
shRNAmir  was  processed  in  vivo.  To  address  this  issue,  we  took  advantage  of 
existing  studies  of  miR30  biogenesis  that  mapped  its  processing  sites  45.  Cullen 
and  colleagues  also  mapped  cleavage  sites  for  a  modified  miR30  in  which  the 
mature  microRNA  sequences  had  been  replaced  by  sequences  targeting 
luciferase  45.  Using  this  information  as  a  guide,  we  designed  a  series  of 
constructs  predicted  to  generate  small  RNAs  targeting  mouse  p53,  human  PTEN 
and  luciferase.  To  verify  processing  sites,  we  transfected  human  293  cells  with 
pSM2  carrying  each  of  these  inserts  and  mapped  the  mature  3’  ends  of  the  guide 
and  passenger  strands  of  p53  and  PTEN  shRNAs  and  the  guide  strand  of  the 
luciferase  shRNA  (we  were  unable  to  detect  the  passenger  strand  for  this 
construct) by RACE-PCR (Fig. 1a; Supplementary Fig. 1). Since northern
blotting indicated that maturation of shRNAmirs produced 22 nucleotide species
(Fig. 1b), we were able to infer the 5’ end of each small RNA species. We
consider the possibility of two processing sites at each end of the shRNA, since
our analysis in the cases of p53 and PTEN shRNAs could not distinguish
between processing at either of two terminal bases (Fig. 1a). However, in the
case of luc1309, the answer was relatively clear: the guide strand was
cleaved  in  most  cases  (8/10)  at  the  most  3’  indicated  site.  Two  out  of  ten 
sequences  indicated  cleavage  1  base  5’  of  that  site,  perhaps  reflecting  a  genuine 
heterogeneity  in  Drosha  cleavage  (RACE1,  RACE3;  Supplementary  Fig.  1).  We 
therefore  feel  it  most  likely  that  cleavage  is  most  prevalent  at  the  sites  indicated 
by the heavy red (Drosha) and blue (Dicer) lines, but our data are consistent with
the  possibility  of  some  cleavage  also  occurring  at  the  sites  indicated  by  the 
lighter  lines. 

To test the performance of pSM2 in comparison to pSM1, we used both
vectors  to  express  a  sequence  targeting  firefly  luciferase.  The  sequence  was 
inserted  such  that  an  identical  mature  small  RNA  would  be  generated  from  each 
construct  following  processing  in  vivo  (Supplementary  Fig.  2).  Of  primary 
concern  was  the  overall  amount  of  mature  small  RNA  that  would  be  generated 
from  each  construct.  This  was  critical  as  dose-response  experiments  for 
shRNAs  indicate  that  suppression  correlates  very  well  with  the  amount  of  RNA 
delivered 37, particularly at the relatively low doses that are expected to be
achieved  by  expression  from  transfected  or  integrated  constructs  as  compared  to 
directly transfected synthetic RNAs. We transfected pSM1-luc and pSM2-luc into
293  cells,  prepared  RNA  and  assayed  the  processed  small  RNA  by  northern 
blotting.  Cells  transfected  with  pSM2-luc  contained  roughly  12-fold  more  of  the 
small RNA than did cells transfected with pSM1-luc (Fig. 1b).

As  it  is  now  clear  that  primary  microRNAs  are  transcribed  mainly  by  RNA 
polymerase II 28,29, we wished to compare the performance of shRNAmirs driven
by a variety of different promoters. We therefore cloned two different shRNAmir
cassettes targeting firefly luciferase downstream of three different RNA
polymerase III promoters (tRNA-val 46, U6 42 and H1 47) and two different RNA
polymerase  II  promoters  (MSCV-LTR  and  CMV  48).  These  constructs  were  each 
prepared  in  a  plasmid  backbone  that  carried  no  other  mammalian  promoter. 

Each  was  transfected  in  combination  with  a  homologous  target  expression 
plasmid  encoding  firefly  luciferase  and  with  a  non-targeted  reporter  plasmid, 
encoding  Renilla  luciferase,  as  a  means  of  normalization.  We  compared  the 
performance of these plasmids in four different cell lines, including two from
human (HEK-293T, MDA-MB-231), one from mouse (NIH-3T3) and one from
dog (MDCK). When the ability of these constructs to suppress the luciferase
target was compared using a very efficient shRNAmir (luc1309), we saw virtually
no difference in the performance of the various promoters (Fig. 1c). However,
when a less efficient shRNAmir (luc311) was used, differences became apparent
(Fig. 1c). In this, and numerous experiments with other shRNAs (not shown), the
U6 snRNA and CMV promoters gave the best and most consistent repression.
The MSCV LTR, tRNA-val and H1 promoters worked less efficiently overall.
Based  upon  these  studies,  we  chose  to  retain  the  U6  snRNA  promoter  in  our 
base  library  vector,  pSM2.  It  is  important  to  note  that  all  of  our  studies  have  been 
carried  out  in  transient  assays.  In  situations  in  which  constructs  are  stably 
integrated  into  the  genome  at  single  copy,  different  configurations  of  promoters 
and flanking sequences perform more efficiently than U6 (see accompanying
paper  by  Dickins  et  al.,  and  Stegmeier  et  al.,  in  press).  However,  we  can  also 
suppress  gene  expression  by  stable  integration  of  pSM2  directly  (Supplementary 
Figure  3). 
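
The Renilla normalization used in the promoter comparison above can be sketched as a ratio of ratios. This is an illustration only: the comparison against a control transfection without a targeting shRNAmir, the function names, and the readings are assumptions introduced here and are not described in the text.

```python
def normalized_signal(firefly, renilla):
    """Firefly signal divided by the co-transfected Renilla signal."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive to normalize")
    return firefly / renilla

def relative_expression(sample, control):
    """Normalized target signal in a sample relative to a control transfection.

    Each argument is a (firefly, renilla) pair of luminescence readings.
    A value of 0.2 would correspond to 80% suppression of the target.
    """
    return normalized_signal(*sample) / normalized_signal(*control)

# Made-up readings: shRNAmir-transfected well versus an assumed no-shRNA control.
print(round(relative_expression((200.0, 1000.0), (1000.0, 1000.0)), 3))  # 0.2
```

Dividing by the Renilla signal corrects for well-to-well differences in transfection efficiency and cell number, so that promoters are compared on target suppression alone.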

Based  upon  these  tests  we  constructed  our  second-generation  shRNA 
library  vector,  pSM2,  as  shown  in  figure  2a.  The  shRNAmir  expression  cassette  is 
carried  within  a  self-inactivating  murine  stem  cell  virus.  Expression  of  the  small 
RNA  is  driven  by  the  U6  snRNA  promoter.  As  with  the  first-generation  shRNAs, 
a  U6  snRNA  leader  sequence  lies  between  the  promoter  and  the  5’  end  of  the 
miR-30  flanking  region.  Synthetic  oligonucleotides  encoding  shRNAs  are 
inserted into XhoI and EcoRI sites that lie within the miR-30 primary microRNA
sequences. Immediately following the miR-30 cassette in each vector is an RNA
polymerase  III  termination  signal  and  a  randomly  generated  60  nucleotide 
barcode  region  to  facilitate  tracking  of  individual  hairpin  RNAs  in  complex 
populations. This feature is similar to that described for our first-generation RNAi
library 1,43. The pSM2 vector is also designed such that inserts can be moved by
an in vivo recombination strategy (MAGIC) 49. Key elements of this feature are
the presence on the plasmid backbone of a protein-dependent origin of
replication, R6Kγ, and a transfer origin (oriT) that is dependent upon a
complementing locus in the host cells. To permit recombination into the recipient
plasmid, the shRNAmir cassette is flanked by I-SceI restriction sites which, when
cut in the recipient strain, reveal homology regions for recombination into the
recipient plasmid. One key difference in the second-generation shRNA
libraries is that the 5' homology region is the miR-30 flanking sequence itself
rather  than  an  artificial  sequence.  Thus,  in  the  second  generation  libraries,  the 
shRNA  cassette  is  transferred  without  the  U6  snRNA  promoter.  This  allows  the 
construction of mating recipients that contain inducible or tissue-specific
promoters  (Stegmeier  et  al.,  in  press).  Finally,  the  pSM2  vector  can  be  selected 
for  integration  into  target  cells  using  a  puromycin  selection  marker. 

Six different shRNAmir sequences were designed for each of 34,711
different known and predicted human genes and 32,628 mouse genes. In each
case,  shRNAs  were  designed  such  that  the  mature  small  RNA  generated  from 
each  construct  followed  thermodynamic  asymmetry  rules  that  have  been 
successfully  applied  for  the  design  of  siRNAs.  Based  upon  the  approaches  used 
to  map  the  termini  of  the  mature  small  RNAs  generated  from  our  vectors,  we 
could  not  definitively  distinguish  between  processing  at  two  possible  sites. 
Additionally,  Dicer  has  been  shown  to  generate  some  3’  end  heterogeneity  in 
processing  its  substrates.  Therefore  we  chose  sequences  that  gave  similar 
thermodynamic  profiles  even  if  cleavage  sites  were  shifted  by  a  base  at  either 
end  (the  cleavage  positions  indicated  in  Fig.  1). 

Construction of the library proceeded stochastically using a highly parallel
in situ synthesis approach for oligonucleotide production (Fig. 2b). Groups of
~22,000 oligonucleotides, each containing a different shRNAmir cassette, were
synthesized on glass-slide microarrays 50. Populations were eluted from the
arrays and amplified by PCR. In order to ensure efficient cloning, the pSM2
backbone was inserted into a lambda phage backbone such that it was flanked
by loxP sites. λ-pSM2 contains unique XhoI and EcoRI sites for subcloning amplified
hairpins and unique FseI and AvrII sites for insertion of barcode 60-mers. λ-pSM2
was first barcoded with a mixed library of random 60-nucleotide sequences
amplified with a primer set which included one primer with an FseI site and one
primer with a T7 promoter followed by the AvrII site. Amplified barcoded λ-pSM2
libraries were lysogenized into a strain we constructed for this purpose,
DH10βλKP, which has a wild-type pir1 gene and the lambda repressor, cI, to
allow λ-pSM2 to replicate as a 42 kb plasmid. Approximately 10^8 CmR KmR
lysogens were selected and served as a barcoded library pool. Barcoded λ-pSM2
was CsCl purified, then cleaved with EcoRI and XhoI before being ligated
to gel-purified EcoRI-XhoI cleaved pooled shRNAmir inserts from an individual
chip and packaged. Average library sizes were ~5 x 10^7 recombinants per pool.
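As a rough check on the pool sizes quoted above, the following sketch estimates how many independent recombinants represent each designed oligonucleotide in a pool. It assumes, purely for illustration, that every oligonucleotide on a chip is cloned with equal efficiency, which real pools will not satisfy; the function name is hypothetical.

```python
def fold_representation(recombinants: float, distinct_inserts: int) -> float:
    """Average number of independent recombinants per designed insert."""
    if distinct_inserts <= 0:
        raise ValueError("distinct_inserts must be positive")
    return recombinants / distinct_inserts


# Values quoted in the text: ~5 x 10^7 recombinants per pool,
# ~22,000 oligonucleotides synthesized per chip.
pool_size = 5e7
oligos_per_chip = 22_000

coverage = fold_representation(pool_size, oligos_per_chip)
print(f"~{coverage:.0f}-fold average representation per oligonucleotide")
```

Under the equal-efficiency assumption, each design is represented a few thousand times in the primary library, so loss of individual designs at this step is unlikely to come from sampling alone.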

To  generate  pSM2  library  plasmid  pools,  the  phage  were  used  to  infect  an  E.  coli 
strain  we  constructed,  BUN25,  that  expresses  both  Cre  recombinase  and  the 
pir1-116 gene, needed for high-copy R6Kγ replication. Pooled plasmid libraries
were  then  transformed  into  a  mating  competent  host  strain  (BW  F’DOT)  and 
individual  clones  were  sequenced  at  random.  Clones  with  perfect  inserts 
represented  between  25  and  50%  of  the  population,  and  these  were  selected 
and  saved  as  an  arrayed  set.  Accumulation  of  new  clones  from  each  pool  was 
monitored  dynamically  and  once  a  pool  began  to  yield  fewer  unique  clones  per 
sequencing  run,  sequencing  was  halted  and  the  pool  was  resynthesized  without 
those  sequences  that  had  already  been  obtained.  Approximately  70  chips  were 
reiteratively  synthesized  to  maximize  unique  sequencing.  Also,  once  3  or  more 
verified  shRNAs  were  obtained  for  any  given  gene,  the  remaining  shRNAs 
targeting that gene were also withdrawn from the population selected for resynthesis.
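The decision to stop sequencing a pool once it yields fewer new clones can be illustrated with a simple sampling model. The sketch below is not the monitoring procedure used here; it assumes that clones are drawn uniformly at random, that a fixed fraction of reads carry perfect inserts (0.4 is an arbitrary value inside the reported 25-50% range), and the function names are hypothetical.

```python
def expected_unique(n_reads: int, n_designs: int, perfect_fraction: float) -> float:
    """Expected number of distinct perfect designs seen after n_reads random clones."""
    p_hit = perfect_fraction / n_designs  # chance one read is a given perfect design
    return n_designs * (1.0 - (1.0 - p_hit) ** n_reads)


def marginal_yield(n_reads: int, n_designs: int, perfect_fraction: float) -> float:
    """Expected number of new designs contributed by the next read."""
    return (expected_unique(n_reads + 1, n_designs, perfect_fraction)
            - expected_unique(n_reads, n_designs, perfect_fraction))


N_DESIGNS = 22_000        # oligonucleotides per chip (from the text)
PERFECT_FRACTION = 0.4    # assumed, within the reported 25-50% range

for reads in (0, 10_000, 50_000, 100_000):
    found = expected_unique(reads, N_DESIGNS, PERFECT_FRACTION)
    new_per_read = marginal_yield(reads, N_DESIGNS, PERFECT_FRACTION)
    print(f"{reads:>7} reads: ~{found:,.0f} unique designs, "
          f"{new_per_read:.3f} new per read")
```

In this model the yield of new designs per read falls steadily as a pool is sampled, which is the behavior that motivates resynthesizing a pool without the sequences already recovered.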

To  date,  we  have  sequence  verified  79,805  shRNAs  targeting  30,728 
human  genes  and  67,676  shRNAs  targeting  28,801  mouse  genes.  A  tabulation 
of  coverage  within  selected  functional  groups  can  be  found  in  Table  1  for  the 
mouse  and  human  libraries.  The  ultimate  goal  is  to  generate  3  shRNAs  for  each 
target locus. Existing, sequence-verified shRNAs for human are listed in
Supplementary Table 1, and verified mouse shRNAs are listed in Supplementary
Table 2. The full collection, updated dynamically, can be accessed at
http://codex.cshl.edu.
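The library totals given above can be turned into two summary figures: the average number of verified shRNAs per targeted gene, and the fraction of genes in the design set that currently have at least one verified shRNA. The sketch below uses only the counts stated in the text; the dictionary layout and function name are illustrative.

```python
LIBRARY = {
    "human": {"designed_genes": 34_711, "verified_shrnas": 79_805, "targeted_genes": 30_728},
    "mouse": {"designed_genes": 32_628, "verified_shrnas": 67_676, "targeted_genes": 28_801},
}


def summarize(entry: dict) -> tuple[float, float]:
    """Return (verified shRNAs per targeted gene, fraction of designed genes targeted)."""
    per_gene = entry["verified_shrnas"] / entry["targeted_genes"]
    covered = entry["targeted_genes"] / entry["designed_genes"]
    return per_gene, covered


summary = {species: summarize(entry) for species, entry in LIBRARY.items()}
for species, (per_gene, covered) in summary.items():
    print(f"{species}: {per_gene:.2f} shRNAs per targeted gene, "
          f"{covered:.1%} of designed genes covered")
```

Both species come out between two and three verified shRNAs per targeted gene, in line with the stated goal of three per locus.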

Validation  of  the  second-generation  shRNA  libraries 

To  test  the  efficiency  of  the  second-generation  shRNA  libraries,  we  took 
an  approach  that  we  had  previously  used  to  assess  the  performance  of  the  first 
generation reagents 1. A green fluorescent protein (ZsGreen) reporter harboring
the  PEST  domain  of  the  mouse  ornithine  decarboxylase  is  normally  degraded  by 
the  proteasome  51 .  Thus,  cells  harboring  a  destabilized  ZsGreen  expression 
plasmid  show  very  low  levels  of  fluorescence.  Interference  with  proteasome 
function, for example using a synthetic proteasome inhibitor, causes
accumulation  of  the  protein  and  a  corresponding  increase  in  fluorescence.  The 
protein  can  also  be  stabilized  by  suppression  of  any  gene  required  for 
proteasome  function.  Thus,  co-transfection  of  the  reporters  with  an  shRNAmir 
expression  plasmid  can  reveal  whether  a  target  protein  is  involved  in  the 
proteasome  pathway  (Fig.  3a).  Using  this  assay  as  a  primary  test  we  compared  a 
series  of  shRNAs  targeting  proteasomal  subunits  that  were  obtained  from  either 
the  first-  or  second-generation  libraries  (Fig.  3b;  Supplementary  Table  3). 

We  chose  a  total  of  53  shRNAs  targeting  13  different  genes  that  were 
known  to  be  involved  in  proteasome  function  (Fig.  3b).  24  were  from  the  first- 
generation  library  and  29  were  from  the  second-generation  library.  These  were 
co-transfected  with  the  reporter  in  combination  with  a  dsRED-encoding  plasmid 
that  allowed  normalization  of  the  transfections.  It  was  immediately  apparent  that 
the  second-generation  shRNAs  performed  substantially  better  than  the  first- 
generation  shRNAs.  We  noted  that  most  of  the  plasmids  derived  from  the 
second-generation  library  were  as  potent  as  the  best  shRNAs  that  had  been 
selected  from  a  screen  of  the  first-generation  library. 

To gain a more detailed picture of the performance of the second-generation
libraries, we compared results from the proteasome assay for 36
shRNAmir expression plasmids to suppression of target RNAs as measured by
semi-quantitative RT-PCR. Plasmids were transfected into HeLa cells with
approximately 80% efficiency, as measured by reference to a co-transfected
reporter plasmid. Despite this incomplete transfection, all but 6 of the shRNAmirs
reduced the levels of their target RNAs by ~60% or more, with 13/36 of the
shRNAmirs suppressing their targets by the theoretical maximum of ~80% (Fig.
3c, upper panel; Supplementary Table 3). Similar results were seen with an
additional 12 shRNAs that did not target proteasome subunits (not shown).
These studies were also illuminating, as they revealed that the functional assay
in some cases, e.g., PSMB3, did not show a large activation of the reporter
despite substantial suppression of the targeted mRNA (Fig. 3c, lower panel).
Thus, the functional assay slightly underestimated the efficacy of the library.
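The "theoretical maximum" of ~80% follows from the transfection efficiency: if untransfected cells keep expressing the target at the normal level, the population-level knockdown cannot exceed the transfected fraction. The sketch below makes that arithmetic explicit. It assumes equal target expression in every cell and no effect in untransfected cells; the function names are hypothetical.

```python
def max_observable_knockdown(transfection_efficiency: float) -> float:
    """Population-level knockdown if every transfected cell is fully silenced."""
    return transfection_efficiency


def per_cell_knockdown(observed_knockdown: float, transfection_efficiency: float) -> float:
    """Implied knockdown within transfected cells, capped at complete silencing."""
    if not 0 < transfection_efficiency <= 1:
        raise ValueError("transfection_efficiency must be in (0, 1]")
    return min(1.0, observed_knockdown / transfection_efficiency)


efficiency = 0.8  # ~80% of HeLa cells transfected (from the text)
for observed in (0.6, 0.8):
    implied = per_cell_knockdown(observed, efficiency)
    print(f"observed {observed:.0%} -> ~{implied:.0%} within transfected cells")
```

Read this way, an observed ~60% reduction corresponds to roughly 75% suppression in the cells that actually received plasmid, and an observed ~80% reduction corresponds to essentially complete suppression.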

To test the performance of the library on a larger scale, we assayed a set
of 515 kinase shRNAs, which also contained 47 hairpins directed to proteasome
subunits, using the phenotypic assay for proteasome function in a high-throughput,
96-well plate protocol (Fig. 4). In this context, 34/47 shRNAs
targeting the proteasome scored as positives (72%), as compared to 10 shRNAs
targeting genes that had not previously been linked to proteasome function (1.9%).
A secondary screen of the 44 potential positives from the primary screen again revealed
positive signals from all 34 proteasomal shRNAs. However, only 5/10 of the non-proteasomal
shRNAs continued to activate the reporter, and none of the corresponding genes scored
with more than one shRNA in the library (Supplementary Table 4).
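The screen statistics above can be reproduced with a few lines of arithmetic. In the sketch below, the 1.9% figure is obtained by dividing the 10 unexpected positives by all 515 shRNAs in the set; that choice of denominator is an inference from the reported percentage rather than something stated explicitly, and the variable names are illustrative.

```python
def percent(hits: int, total: int) -> float:
    """Hits as a percentage of total."""
    return 100.0 * hits / total


# Primary screen (counts from the text).
proteasome_rate = percent(34, 47)    # proteasome-directed hairpins scoring positive
unexpected_rate = percent(10, 515)   # positives not previously linked to the proteasome

# Secondary screen of the 44 primary positives.
proteasome_confirmed = percent(34, 34)
unexpected_confirmed = percent(5, 10)

print(f"primary: {proteasome_rate:.0f}% proteasomal, {unexpected_rate:.1f}% unexpected")
print(f"secondary: {proteasome_confirmed:.0f}% proteasomal, "
      f"{unexpected_confirmed:.0f}% unexpected confirmed")
```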

Discussion 
Since the discovery that an RNAi pathway was conserved in mammals,
the  exploitation  of  this  silencing  response  as  a  genetic  tool  has  evolved  in 
concert  with  our  deeper  understanding  of  its  biochemical  mechanism.  The  initial 
applications  of  siRNAs  as  triggers  of  the  silencing  response  required 
comprehension  of  the  way  in  which  Dicer  processes  long  dsRNA  substrates  in 
Drosophila 52. Similarly, studies of dicer-mutant C. elegans demonstrated that
endogenous  loci  could  encode  triggers  of  the  RNAi  machinery,  and  this  led  to  the 
notion  that  such  loci  could  be  altered  to  target  genes  for  experimental  silencing 
24 21 .  At  the  time  when  the  first  such  experiments  were  done,  the  nature  of  the 
primary  microRNA  transcript  was  unknown.  Indeed,  it  was  suspected  that  small 
RNAs  were  transcribed  from  the  genome  as  short  hairpin  precursors,  pre- 
miRNAs  that  were  converted  to  mature  small  RNAs  by  Dicer  cleavage.  What 
has  recently  become  clear  is  that  pre-miRNAs  are  simply  a  processing 
intermediate,  generated  by  cleavage  of  a  longer  primary  transcript  (pri-miRNA) 
by the Microprocessor 30-34.

Many  strategies  have  been  developed  for  producing  miRNA-like  triggers 
of  the  RNAi  pathway.  As  we  have  mapped  the  processing  sites  on  precursor 
shRNAmirs,  we  can  predict  what  small  RNA  is  generated  from  each  shRNAmir 
expression  vector.  This  enables  us  to  apply  siRNA  design  rules  to  shRNAmir 
expression  cassettes.  A  combination  of  increased  small  RNA  production  with 
better  shRNA  design  yielded  a  pronounced  increase  in  the  performance  of  these 
silencing  tools. 

Guided  by  these  design  strategies,  we  have  constructed  large  libraries  of 
sequence-verified  shRNAs  targeting  the  majority  of  the  known  and  predicted 
genes  in  the  human  and  mouse  genomes.  On  average,  each  locus  is  covered  by 
2  shRNAmirs  presently;  however,  the  ultimate  goal  is  to  have  3  sequence-verified 
shRNAmirs  for  each  gene.  The  second  generation  libraries  resemble  those  that 
we  have  previously  reported  in  that  they  reside  in  flexible  vectors  that  permit 
shuttling  of  shRNAmir  expression  cassettes  into  virtually  any  desired  expression 
vector  using  a  bacterial  mating  strategy  49.  A  unique  feature  of  the  second 
generation  library  is  that  the  expression  cassette  can  be  moved  without  the  need 
to  move  also  the  constitutive  U6  snRNA  promoter.  This  permits  large  scale 
construction  of  secondary  libraries  under  the  control  of  tissue  specific  and 
inducible  promoters.  Indeed,  regulated  expression  of  our  library  cassettes  from 
RNA  polymerase  II  promoters  has  been  shown  to  effectively  suppress  gene 
expression both in cultured cells and in animals (Dickins et al., see accompanying
paper;  Stegmeier  et  al.,  in  press).  These  recipient  vectors  can  be  directly  used 
with  any  shRNAmir  encoded  by  the  library  described  herein. 

The  use  of  large-scale  resources  for  suppressing  gene  expression  via 
RNAi  promises  to  revolutionize  genetic  approaches  to  biological  problems  in 
numerous  model  systems.  For  human  cells,  both  siRNA  and  shRNA  collections 
have  previously  been  reported  and  are  generally  available  to  investigators  to 
probe a wide range of biological questions. The libraries described here should
prove  useful  for  assessing  the  functions  of  individual  genes  and  for  taking 
genome-wide  approaches.  Strategies  reported  in  the  accompanying  paper  and 
by  Stegmeier  and  colleagues  will  permit  large-scale  application  of  these  tools  for 
screens which require long-term suppression of gene expression using single-copy
integrants or inducible repression. Thus, we have produced a coherent
system of RNAi reagents with utility in both mouse and human experimental
systems.
Methods


Construction of the lysogenic strain DH10βλKP and excision strain BUN25

DH10βλKP [mcrA Δ(mrr-hsdRMS-mcrBC) φ80lacZΔM15 ΔlacX74 deoR recA1
endA1 araΔ139 Δ(ara, leu)7697 galU galK λ- rpsL nupG tonA λpir1-npt] is a
strain containing λcI and the pir1 gene that was constructed in order to
lysogenize the λSM2-barcode library prior to introduction of the hairpin
fragments. To generate this strain, λKP containing the pir1 and KmR genes was
constructed. To generate λKP, the pir1 gene was amplified from BW23473 using
primers MZL393 and MZL51, and cloned into the pCR2.1 TOPO TA cloning
vector. The pir1 gene was excised from the above clone on a BamHI fragment
and ligated into BamHI-cleaved pSE356, which contains an npt gene and a
BamHI restriction site flanked by two 1 kb λ DNA fragments 53, to generate
pSE356pirWT. The pir1-KmR fragment was recombined onto wild-type λ by
amplifying λ on LE392/pSE356pirWT, and the resulting phage were collected and
used to infect DH10β. 100 µl of DH10β cells were infected with 10^6 PFU at 30°C
for 30 minutes in LB + 10 mM MgSO4, diluted with 900 µl of LB, incubated at
30°C for 2 h with shaking, and plated on LB containing 50 µg/ml kanamycin at
37°C overnight to select λKP lysogens. Lysogens were tested for the ability to
lysogenize λ vectors containing R6Kγ origins of replication as extrachromosomal
elements. A strain capable of doing this was selected and named DH10βλKP.

The BUN25 [F' traD36 lacIq Δ(lacZ)M15 proA+B+/e14- (McrA-) Δ(lac-proAB)
thi gyrA96 (NalR) endA1 hsdR17 (rK- mK+) relA1 glnV44 λcre-npt umuC::pir116-Frt
sbcDC-Frt] strain containing pir1-116 and cre was constructed to allow the
conversion of λSM2 shRNA libraries into pSM2 shRNA libraries. A PCR
fragment containing the pir1-116 gene was generated using primers MZL393 and
MZL51 (see Supplementary table 4), cleaved with BamHI and ligated into
BamHI-cleaved pUC18 to generate pML284. A fragment containing BstBI-Frt-
cat-Frt-NdeI (filled-in) was isolated from KD3 54 and inserted into the SmaI site of
pML284 to generate pML334. A HpaI fragment containing UmuDC was isolated
from pSE117 and cloned into pBluescript XhoI (filled in)-EcoRV to generate
pML236. We eliminated one of the BamHI sites on pML236 by digesting it with
PstI-XbaI, filling in with T4 DNA polymerase and ligating. The Frt-cat-Frt-pir116
fragment was isolated from pML334 as a KpnI-SacI (filled-in) fragment and ligated into
MluI/BamHI (filled-in) cleaved pML236-ΔBamHI to generate pML346. The 3.8
kb KpnI-SacI UmuDC-Frt-cat-Frt-pir116-UmuC fragment from pML346 was
integrated into BNN132/pML104 by homologous recombination using the λ
recombinase expressed from pML104, and confirmed by colony PCR. The cat
gene was removed by FLP-mediated excision in vivo using pCP20 55, which
expresses the FLP recombinase, to generate BUN24. A cassette that has Frt-
cat-Frt flanked by 50 bp homology to sbcD and 50 bp homology to sbcC was
amplified with primers MZL493/MZL494 using KD3 as a template. This
cassette was used to replace sbcD and part of sbcC on BNN132 by homologous
recombination, and the deletion was confirmed by colony PCR. The strain was
named BNN132sbcDC-Frt-cat-Frt. We then used a pair of outside primers
(MZL495/MZL496) that gave about 500 bp of homology upstream of
sbcD and 500 bp of homology to sbcC to amplify a PCR product from
BNN132sbcDC-Frt-cat-Frt to recombine onto the sbcDC region of BUN24. The
resulting strain was named BUN25 and is used to stabilize inverted repeats in E.
coli 56.

Library  vector  construction 

A pair of loxP-NotI-loxP duplexed oligos (MZL524/MZL525) were inserted into
the pSM2 BstXI site to generate pSM2c-loxP. A second pair of duplexed oligos
(MZL541/MZL542), carrying the proper restriction sites for cloning barcodes into
λSM2, were inserted into the BbsI-MluI sites of pSM2c-loxP to create pML375.
λACT2 was digested with NotI, and the λ arms were gel purified and ligated to
NotI-digested pML375 to generate λSM2. The ligation mixture was packaged
using MaxPlax™ lambda packaging extracts from Epicentre. We selected a
λSM2 lysogen by infecting 200 µl of BW23473 cells (A600 = ~0.8) with 100 µl of
λSM2 packaging mix in the presence of 10 mM MgSO4 and 0.2% (w/v) maltose,
incubating at 30°C for 30 minutes, then adding 900 µl of LB and incubating at 30°C
for 2 h with shaking to express the CmR marker, and plating on LB containing 17
µg/ml of chloramphenicol (Cm) at 30°C overnight. The proper recombinants
were confirmed by restriction analysis. See Supplementary table 5 for
sequences of referenced oligonucleotides.

Barcode  library  construction 

The 60 base pair barcodes,
gaagactaatgcggccggcca(n)60gggccctatagtgagtcgtatta, were amplified using
barcode primer 1 (aaattgcaatgaagactaatgcggccggcca) and barcode primer 2
(atatatggacgcgtcctaggtaatacgactcactatagggccc). The PCR conditions were: 0.1
pmol of barcodes, 50 pmol of each primer, 25 nmol of each dNTP, and 2.5 U of
Taq DNA polymerase; 94°C for 45 seconds, (94°C for 30 seconds, 55°C for 30
seconds, and 72°C for 30 seconds) x 13, 72°C for 10 minutes, then a 4°C hold. Ten
PCR reactions were pooled together, purified using a QIAquick PCR purification
kit, digested with BamHI, EcoRI, XhoI and SalI to eliminate barcodes containing
these sites, and gel purified. The purified barcodes were digested with FseI and
AvrII and purified using the QIAquick gel extraction kit. Two micrograms of FseI-
AvrII digested λSM2 were ligated with 10 ng of FseI-AvrII digested barcodes with 1 x
ligation buffer and 0.5 µl T4 DNA ligase in a 5 µl final volume at 16°C overnight.
The ligation mixture was packaged and amplified. The size of the λSM2-barcode
library was 4.2 x 10^7. We used 20 ml of DH10βλKP cells (A600 = ~1) to lysogenize
2 x 10^9 of the λSM2-barcode library as 42 kb plasmids. The cells and the phage were
mixed in the presence of 10 mM MgSO4 and 0.2% (w/v) maltose and incubated
at 30°C for 30 minutes, then 200 ml of LB was added and the culture was allowed to
recover at 30°C for 1 h with shaking.
The mixture was concentrated by centrifugation at 4000 rpm for 20 minutes,
resuspended in 3 ml of LB, plated on 10 large LB plates containing 17 µg/ml Cm, and
incubated at 30°C overnight. The cells were scraped from plates and grown in 3 L of TB
containing 17 µg/ml of Cm overnight. Supercoiled λSM2-barcode library DNA
was prepared by cesium chloride purification. The lysogenization efficiency was
approximately 30%.
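The relationship between the barcode template and its two primers can be checked computationally. The sketch below confirms that the 3' end of each primer matches the constant regions flanking the random 60-mer, assembles the expected top strand of the PCR product, and looks for the FseI (GGCCGGCC) and AvrII (CCTAGG) recognition sequences used for cloning. The sequences are those given in the text; the helper function and variable names are illustrative only.

```python
COMPLEMENT = str.maketrans("acgtn", "tgcan")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a lowercase DNA sequence (n stays n)."""
    return seq.lower().translate(COMPLEMENT)[::-1]


LEFT = "gaagactaatgcggccggcca"        # constant region 5' of the random 60-mer
RIGHT = "gggccctatagtgagtcgtatta"     # constant region 3' of the random 60-mer
PRIMER_1 = "aaattgcaatgaagactaatgcggccggcca"
PRIMER_2 = "atatatggacgcgtcctaggtaatacgactcactatagggccc"

SITES = {"FseI": "ggccggcc", "AvrII": "cctagg"}
T7_PROMOTER = "taatacgactcactataggg"

# Primer 1 is identical to the left constant region at its 3' end;
# primer 2 anneals to the right constant region, so it ends in its reverse complement.
primer1_anneals = PRIMER_1.endswith(LEFT)
primer2_anneals = PRIMER_2.endswith(reverse_complement(RIGHT))

# Top strand of the expected product, with the random region shown as n.
amplicon = PRIMER_1 + "n" * 60 + reverse_complement(PRIMER_2)

print("primer 1 anneals:", primer1_anneals)
print("primer 2 anneals:", primer2_anneals)
print("amplicon length:", len(amplicon))
for name, site in SITES.items():
    print(f"{name} site at position {amplicon.find(site)}")
print("T7 promoter in primer 2:", T7_PROMOTER in PRIMER_2)
```

The two cloning sites fall on opposite sides of the random region, which is what allows the FseI-AvrII digest to release the barcode for ligation.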

Oligonucleotide  Cleavage  and  PCR  Amplification 

To  harvest  oligonucleotides,  we  treated  microarrays  for  2  h  with  2-3  mL  of  35% 
NH4OH  solution  (Fisher  Scientific)  at  room  temperature.  We  transferred  the 
solution  to  1.5-mL  microcentrifuge  tubes  and  subjected  it  to  speed  vacuum  drying 
at medium heat (~55 °C) overnight. We resuspended the dried material in 200 µl
of RNase/DNase-free water and performed PCR amplification in 50 µl reaction
volumes using Invitrogen's Platinum® Pfx DNA Polymerase. To obtain a
sufficient amount of PCR product, four 50 µl reactions were required for each
sample. Each reaction contained 2X Pfx PCR amplification buffer, 0.3 mM of
each dNTP, 1 mM MgSO4, 0.3 µM of each primer, 0.5X PCR enhancer solution,
0.5 units of Platinum® Pfx DNA Polymerase, and 10 µl of template DNA. The
primers used for amplification were 5'-mir30-PCR-xhoI-F (5'
CAGAGGCTCGAGAAGGTATATTGCTGTTGACAGTGAGCG 3') and 3'-mir30-PCR-ecorI-R
(5' CGCGGCGAATTCCGAGGAGTAGGCA 3'). After an initial
denaturation step of 94°C for 5 min, DNA amplification occurred through 25
cycles of denaturing at 94°C for 45 seconds and annealing and extension at 68°C
for 1 min and 15 sec, followed by a final 7 minute extension at 68°C. We then
combined the four reactions into one tube, cleaned up the PCR product using
the QIAGEN® MinElute PCR Purification Kit and eluted in a total volume of 26 µl.

ShRNA  library  construction 

The λSM2-barcode library and shRNA PCR products were digested with EcoRI
and XhoI overnight and gel purified. Ligations with shDNA oligos were set up as
follows: 1.5 µg of XhoI-EcoRI cleaved vector, 8-10 ng of XhoI-EcoRI cleaved
inserts generated from the PCR of shDNA oligos from the parallel microarray
synthesis, 1 µl of 10 x ligation buffer, 0.5 µl of T4 DNA ligase, and water to 10 µl
final. The ligation mixtures were incubated at 16°C overnight and packaged.
We typically observed 30- to 90-fold stimulation of plaque forming units (PFU)
and 2 x 10^7 to 8 x 10^7 PFU total for each library pool. We typically amplified 2 x
10^7 PFU for each pool to generate a stock. To verify the ligation efficiency, we
excised 10 µl of packaging mix by infecting 100 µl of BUN25 (A600 = ~0.5) and
selected colonies on LB containing 30 µg/ml Cm at 30°C overnight. Colony PCR was
performed using forward (ggacgaaacaccgtgctcgc) and reverse
(ttctgcgaagtgatcttccg) primers, and 85 to 95% correct-sized inserts were typically
observed, with some containing multiple inserts. To generate plasmid DNA from
these libraries, we typically excised 5 x 10^7 PFU through infection of BUN25
cells as described earlier. The cells were scraped from plates and grown in 2 L of
LB plus 13 g/l of CircleGrow at 37°C for 7 to 8 h. The cesium chloride method was
used to prepare DNA. DNAs were transformed into BW23474 F'DOT SbcC, and
individual clones were sequenced using the 5' primer
(TGTGGAAAGGACGAAACACC). Correct clones were individually rearrayed to
form the final library.
Small RNA Northern Blots

293 cells were transfected in 10 cm dishes at 60% confluency with 15 µg of
shRNA plasmid DNA along with 5 µg of pDsRed-N1 (Clontech) using TransIT-
LT1 (Mirus). At 48 h post transfection, transfection efficiency was confirmed by
estimating the percentage of cells expressing DsRed (~80%), and then total RNA
was Trizol extracted and purified. Small RNA northern blots were carried out as
described 57 using 30 µg RNA/lane. For hairpins targeting EGFP at starting
position 481 (Fig. 1b), northern probes were DNA oligos corresponding to the
anti-sense strand (ccggcatcaaggtgaacttcaa) of the mature RNA.

Proteasome  assays 

Bacterial cultures were grown in 96-well plates for 36 h in GS96 media (Bio101).
Plasmid DNA was extracted using Qiagen Ultrapure plasmid minipreps in a 96-
well plate format. DNA concentrations were determined by mixing an aliquot of
each sample with PicoGreen (Molecular Probes) and determining fluorescence on
a Victor2 plate reader. HEK 293T cells were plated in 96-well optic plates
(Corning) at 1 x 10^5 cells per ml. For the proteasome assay, 12.5 ng of the
plasmid dsRed N-1 (Clontech), 5 ng of the ZsProSensor (Clontech) and 75 ng of
each individual shRNA construct were cotransfected per well using 0.3 µl of LT-1
(Mirus) transfection reagent. After 24 h the transfection media was replaced.
After  72  h,  media  was  removed  and  replaced  with  PBS  in  order  to  read 
fluorescence.  Fluorescence  signals  were  read  on  a  Victor2  plate  reader.  Signals 
in  the  green  channel  were  normalized  to  transfection  efficiency  using  customized 
scripts  with  fluorescence  in  the  red  channel  serving  as  a  normalization  criterion. 
Cut-offs  were  assigned  by  using  control  shRNA  transfections  to  determine  the 
range  for  a  negative  outcome. 
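The normalization and cut-off steps described above can be sketched as follows. This is not the customized script used in the study: the rule for the negative range (mean plus three standard deviations of the control ratios) is an assumption chosen for illustration, the fluorescence values are invented, and the function names are hypothetical.

```python
from statistics import mean, stdev


def normalized_signal(green: float, red: float) -> float:
    """Reporter (green) signal corrected for transfection efficiency (red)."""
    if red <= 0:
        raise ValueError("red channel signal must be positive")
    return green / red


def negative_cutoff(control_ratios: list[float], n_sd: float = 3.0) -> float:
    """Upper bound of the negative range; mean + n_sd * SD is an assumed rule."""
    return mean(control_ratios) + n_sd * stdev(control_ratios)


def call_hits(wells: dict[str, tuple[float, float]],
              control_ratios: list[float],
              n_sd: float = 3.0) -> dict[str, bool]:
    """Flag wells whose normalized signal exceeds the control-derived cut-off."""
    cutoff = negative_cutoff(control_ratios, n_sd)
    return {name: normalized_signal(green, red) > cutoff
            for name, (green, red) in wells.items()}


# Invented example values: (green, red) fluorescence per well.
controls = [0.10, 0.12, 0.11, 0.09]          # green/red ratios of control shRNAs
wells = {"shRNA_A": (900.0, 1000.0), "shRNA_B": (110.0, 1000.0)}

hits = call_hits(wells, controls)
print(hits)
```

Dividing by the red channel removes well-to-well differences in transfection, so that the cut-off reflects reporter stabilization rather than the amount of DNA delivered.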

Plasmid  Transfections  and  mRNA  Quantitation 

HeLa cells were seeded at 0.5 x 10^5 cells/well in 24-well plates and transfected
24 h later with 1 µg/well of the appropriate plasmid. Each plasmid was delivered
to  4  wells  by  use  of  Lipofectamine  2000  (Invitrogen)  according  to  the 
manufacturer’s  protocols.  Transfection  efficiency  was  determined  by  parallel 
transfection  of  a  GFP-expressing  plasmid  and  the  percentage  of  fluorescent  cells 
assayed  by  flow  cytometry.  For  analysis  of  target  gene  mRNA  knock  down,  cell 
lysates  were  collected  24  h  after  transfection,  and  total  RNA  was  prepared  by 
use  of  RNeasy  columns  (Qiagen)  following  the  manufacturer’s  protocols. 
Messenger  RNA  quantitation  was  performed  by  Real-time  PCR  of  reverse 
transcription  products,  using  available  Applied  Biosystems  TaqMan™  primer 
probe  sets,  and  the  percent  mRNA  remaining  was  determined  by  comparison 
with  mRNA  levels  from  cells  treated  with  transfection  reagent  alone. 
Table 1  Coverage by functional group

Group                           Human genes  Human hairpins  Mouse genes  Mouse hairpins
Cancer                                  859            3082          890            2524
Cell cycle                              531            2166          482            1552
Checkpoint                              123             541          116             367
DNA repair                              118             512          130             355
DNA replication                         238             961          248             719
Enzymes                                2943           10456         2818            8302
GPCRs                                   669            2101          663            1795
Kinases                                 618            2648          575            2250
Dual Specificity Phosphatases            35             144           32             114
Tyrosine Phosphatases                    36             184           33             166
Phosphatases                            206             765          187             628
Proteases                               454            1431          441            1168
Proteolysis                             302            1458          270             858
Signal Transduction                    2650            9046         2541            7274
Protein Trafficking                     476            1596          458            1309
Transcription                           820            2865          767            2209
Apoptosis                               581            2061          558            1538
Figure Legends


Figure 1. Design and structure of shRNAmir cassettes. (a) A comparison of
the structures of several silencing triggers is shown. These include an siRNA, a
portion of the shRNA precursor, as generated from our first-generation design in
pSM1, and a segment of the shRNAmir precursor produced by pSM2. The
sequence of the target site (sense orientation) from firefly luciferase (luc1309,
see c) is shown in blue (passenger strand) with the guide strand shown in red.
For pSM2, mapped potential cleavage sites for Dicer and Drosha are indicated
by blue and red lines respectively. (b) Northern blotting was used to detect the
mature small RNA produced after transfection of HEK-293T cells with shRNA
and shRNAmir cassettes expressed from pSM1 and pSM2, respectively, by the
U6 snRNA promoter. In neither case was significant accumulation of pre-miRNA
observed. Transfection rates were normalized using a co-delivered dsRED
expression plasmid. (c) Five different promoters (human tRNA-Val, human H1
RNA, human U6 snRNA, MSCV LTR and human CMV IE, as indicated) were
tested for their ability to drive shRNAmir expression and silence luciferase in
transient transfections. Two different shRNAs were used, a highly efficient
shRNA (luc1309) and a less efficient shRNA (luc311). In each case, the level of
firefly luciferase was normalized to a non-targeted Renilla luciferase. Controls
with empty vectors lacking a hairpin insert are also shown.

Figure 2. Construction of the second-generation library. (a) The pSM2c
vector contains a U6 promoter; a U6 terminator following mir3'; a self-inactivating
retroviral backbone; two bacterial antibiotic resistance markers, kanamycin and
chloramphenicol; a protein-dependent origin (R6Kγ); a mammalian selectable
marker (puromycin) driven by the PGK promoter; a homology region (HR2) for
use in bacterial recombination; and a randomly generated 60-mer barcode
sequence. The shRNAmir inserts were cloned between the 5' and 3' flanking
sequences derived from the mir-30 primary transcript using XhoI and EcoRI
restriction sites. The nucleotide positions for sites in an excised version of an
empty vector (no shRNA or barcode) are given. (b) Construction of the
second-generation libraries began with the generation of a lambda derivative of pSM2
that contained unique EcoRI, XhoI, FseI and AvrII sites, the latter two for
insertion of barcodes. A barcoded pre-library was generated by the ligation of
PCR-amplified random 60-mers into FseI-AvrII cleaved λpSM2 to generate a
barcoded library pool (upper right). The barcoded λpSM2 was converted into an
shRNA library by insertion of PCR-amplified shRNA constructs, prepared by in
situ synthesis of inserts on a microarray in pools of ~22,000, into the EcoRI-XhoI
cleaved pre-library (upper right). Packaged phage were amplified and used to
infect BUN25, which expresses Cre recombinase and pir1-116 for pSM2
replication. Each excision event gave rise to a KanR CmR colony.
These were pooled and used for preparation of library DNA. This, in turn, was
transformed into BW F'DOT and individual colonies were selected for sequence
analysis.

Figure 3. Validation of the second-generation library. (a) A schematic
representation of the phenotypic assay for proteasome function (see text) is
shown. (b) Thirteen proteasome subunits were chosen because of their
representation in both the first- and second-generation libraries (for sequences
see Supplementary Table 3). shRNA expression clones corresponding to each
were assayed for activation of the proteasome reporter. Blue bars indicate first-
generation clones, while green bars indicate second-generation clones. In all
cases, the activity of the proteasome reporter (green channel) was normalized for
transfection using a dsRED expression plasmid. (c) In a separate study, 36
different proteasome shRNAs (for sequences see Supplementary Table 3) were
tested for their ability to suppress their target RNAs (upper panel). qRT-PCRs
were performed 24 h after transfection of HeLa cells at an average efficiency of
80%, as measured by a co-transfected normalization reporter (dsRED). The
hypothetical maximum suppression, as calculated from the transfection
efficiency, is indicated by the black line. For comparison, functional assays for
proteasome inhibition were performed in parallel (lower panel).
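
The "hypothetical maximum suppression" in (c) follows from simple arithmetic: if only a fraction of the cells receive the shRNA, the untransfected remainder still contributes target RNA to the qRT-PCR signal. The sketch below illustrates that reasoning. It assumes that untransfected cells express the target at the control level and that knockdown in transfected cells could at best be complete; the function names and the 40% example value are hypothetical and are not taken from the paper.

```python
def remaining_rna_floor(transfection_efficiency):
    """Lowest possible fraction of target RNA left in the whole population.

    Assumes untransfected cells keep expressing the target at the control
    level and that transfected cells lose it completely.
    """
    if not 0.0 < transfection_efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return 1.0 - transfection_efficiency


def knockdown_in_transfected(observed_remaining, transfection_efficiency):
    """Estimate knockdown within transfected cells from a population value.

    observed_remaining: fraction of target RNA left in the population
    (1.0 = no suppression), e.g. from qRT-PCR relative to a control.
    """
    floor = remaining_rna_floor(transfection_efficiency)
    # Fraction of RNA still present in the transfected cells themselves.
    per_cell = (observed_remaining - floor) / transfection_efficiency
    per_cell = min(max(per_cell, 0.0), 1.0)  # clamp measurement noise
    return 1.0 - per_cell


# With the 80% average efficiency stated in the legend, at most 80% of
# the population signal can be removed, leaving a floor of about 20%.
print(remaining_rna_floor(0.8))
# Hypothetical measurement: 40% of the target RNA remains overall.
print(knockdown_in_transfected(0.4, 0.8))
```

Under these assumptions, a population-level measurement of 40% remaining RNA would correspond to roughly 75% knockdown in the cells that actually took up the construct.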

Figure 4. Performance of the second-generation library in a small-scale
high-throughput screen. Forty-seven shRNAs targeting proteasome subunits were
distributed among a series of 562 hairpins targeting human kinases (upper
panel). The lower left panel shows the negative (FF) and positive (ATPase 1.1
to 1.3) controls from the first- (pSM1) and second-generation (pSM2) libraries.
The lower right panel shows the shRNAs that displayed accumulation of the
proteasome reporter over the cut-off (2-fold or greater activation; yellow line).
These are highly enriched for proteasome shRNAs (red). In blue are 10 additional
non-proteasomal shRNAs that also scored positive in the screen. Of these, 5 were
also positive on a retest of individual clones. The sequences of the shRNAs in
the lower right panel, in order from left to right, are given in Supplementary
Table 4.
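
The hit-calling rule in this legend (2-fold or greater activation of the reporter) can be sketched as a simple threshold filter. The code below is an illustration only: the function names, the hairpin identifiers and the fold-activation values are hypothetical and do not come from the screen data.

```python
def call_hits(fold_activation, cutoff=2.0):
    """Return hairpin names at or above the cutoff, strongest first.

    fold_activation: dict mapping hairpin name -> reporter activation
    relative to the negative control (1.0 = no change).
    """
    hits = [name for name, fold in fold_activation.items() if fold >= cutoff]
    return sorted(hits, key=lambda name: fold_activation[name], reverse=True)


def fraction_in_set(hits, reference_set):
    """Fraction of hits that belong to a reference set (0.0 if no hits)."""
    if not hits:
        return 0.0
    return sum(1 for name in hits if name in reference_set) / len(hits)


# Hypothetical values for a handful of hairpins.
fold = {"PSM_a": 6.5, "PSM_b": 3.1, "KIN_1": 2.4, "KIN_2": 1.1, "FF": 1.0}
proteasome = {"PSM_a", "PSM_b"}

hits = call_hits(fold)
print(hits)
print(fraction_in_set(hits, proteasome))
```

The same helper could be applied to the retest: for the figures stated in the legend, 5 of 10 non-proteasomal positives were confirmed on retesting individual clones.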


P53_1224 Guide Strand (Drosha site)

3'-TGTTCATGTACACATT-5'  RACE primer
3'-AGGUGAUGUUCAUGUACACAUU-5'  predicted small RNA
5'-...TTTTTTTTTTTTCCACTACAAGTACATGTGTAA...  RACE 1
   ...TTTTTTTTTTTTCCACTACAAGTACATGTGTAA...  RACE 2
   ...TTTTTTTTTTTTCCACTACAAGTACATGTGTAA...  RACE 3
   ...TTTTTTTTTTTTCCACTACAAGTACATGTGTAA...  RACE 4
   ...TTTTTTTTTTTTCCACTACAAGTACATGTGTAA...  RACE 5
   ...TTTTTTTTTTTTCCACTACAAGTACATGTGTAA...  RACE 6
   ...TTTTTTTTTTTTCCACTACAAGTACATGTGTAA...  RACE 7
   ...TTTTTTTTTTTTCCACTACAAGTACATGTGTAA...  RACE 8
   ...TTTTTTTTTTTTCCACTACAAGTACATGTGTAA...  RACE 9
   ...TTTTTTTTTTTTCCACTACAAGTACATGTGTAA...  RACE 10

P53_1224 Passenger Strand (Dicer site)

3'-TGTACATGAACATCAC-5'  RACE primer
3'-AUAAUGUGUACAUGAACAUCAC-5'  predicted small RNA
5'-...TTTTTTTTTTTTATTACACATGTACTTGTAGTG...-3'  RACE 1
5'-...TTTTTTTTTTTTATTACACATGTACTTGTAGTG...-3'  RACE 2
5'-...TTTTTTTTTTTTATTACACATGTACTTGTAGTG...-3'  RACE 3
5'-...TTTTTTTTTTTTATTACACATGTACTTGTAGTG...-3'  RACE 4
5'-...TTTTTTTTTTTTATTACACATGTACTTGTAGTG...-3'  RACE 5
5'-...TTTTTTTTTTTATTACACATGTACTTGTAGTG...-3'  RACE 6
5'-...TTTTTTTTTTTTATTACACATGTACTTGTAGTG...-3'  RACE 7
5'-...TTTTTTTTTATTACACATGTACTTGTAGTG...-3'  RACE 8
5'-...TTTTTTTTTTTATTACACATGTACTTGTAGTG...-3'  RACE 9
5'-...TTTTTTTTTTTTATTACACATGTACTTGTAGTG...-3'  RACE 10


Luc_1309 Guide Strand (Drosha site)

3'-ACTTCAGAGACTAATT-5'  RACE primer
3'-UGGCGGACUUCAGAGACUAAUU-5'  predicted small RNA
5'-...TTTTTTTTTTT.CCGCCTGAAGTCTCTGATTAA...-3'  RACE 1
5'-...TTTTTTTTTTTTACCGCCTGAAGTCTCTGATTAA...-3'  RACE 2
5'-...TTTTTTTTTTTT.CCGCCTGAAGTCTCTGATTAA...-3'  RACE 3
5'-...TTTTTTTTTTTTACCGCCTGAAGTCTCTGATTAA...-3'  RACE 4
5'-...TTTTTTTTTTTTACCGCCTGAAGTCTCTGATTAA...-3'  RACE 5
5'-...TTTTTTTTTTTTACCGCCTGAAGTCTCTGATTAA...-3'  RACE 6
5'-...TTTTTTTTTTACCGCCTGAAGTCTCTGATTAA...-3'  RACE 7
5'-...TTTTTTTTTTTTACCGCCTGAAGTCTCTGATTAA...-3'  RACE 8
5'-...TTTTTTTTTTTTACCGCCTGAAGTCTCTGATTAA...-3'  RACE 9
5'-...TTTTTTTTTTTTACCGCCTGAAGTCTCTGATTAA...-3'  RACE 10

TTTTTTTTTTTTGGGATTTCCTGCAGAAAGACT-3'
TTTTTTTTTTTTGGGATTTCCTGCAGAAAGACT-3'
TTTTTTTTTTTTGGGATTTCCTGCAGAAAGACT-3'

...TTTTTTTTTTTTAAGTCTTTCTGCAGGAAATCC-3'
...TTTTTTTTTTTTAAGTCTTTCTGCAGGAAATCC-3'
...TTTTTTTTTTTTAAGTCTTTCTGCAGGAAATCC-3'

Supplementary Figure 1. Mapping of Dicer and Drosha cleavage sites. To
map cleavage sites for small RNAs generated by pSM2, we used 3' RACE. 293
cells were transfected with constructs corresponding to p53_1223, PTEN_1137
and luc_1309. Small RNAs were converted to cDNA after tailing with poly(A)
polymerase and amplified using a specific primer (as indicated for each small
RNA) and an anchored dT primer (according to the manufacturer's instructions,
Roche). PCR products were cloned into the TOPO-TA vector (Invitrogen), and 10
clones were sequenced for each of the PTEN, p53 and luc guide strands and the
PTEN and p53 passenger strands. Since the RNAs were A-tailed, the presence
or absence of predicted terminal A residues was ambiguous, and they are
therefore indicated in red.
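
The comparison between a sequenced RACE clone and the predicted small RNA can be sketched in a few lines. The example below is not the authors' procedure: it assumes that a clone read consists only of the oligo(dT)-derived T run followed by the insert (the flanking sequence shown as "..." in the alignments is ignored), and the function names are hypothetical. It reverse-complements the cloned cDNA, removes the poly(A) tail, and reports how many terminal A residues of the prediction cannot be resolved because of A-tailing.

```python
# Complement table for DNA bases (cloned cDNA reads are DNA).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def clone_to_rna(clone_dna):
    """Reverse-complement a cloned cDNA read and return it as RNA, 5'->3'."""
    return clone_dna.upper().translate(_COMPLEMENT)[::-1].replace("T", "U")


def compare_3p_end(clone_dna, predicted_3to5):
    """Compare a RACE clone with a predicted small RNA written 3'->5'.

    Trailing A residues are stripped from both sequences because an
    A that is part of the small RNA cannot be told apart from the
    enzymatically added poly(A) tail.
    """
    predicted = predicted_3to5.upper()[::-1]        # now 5'->3'
    observed = clone_to_rna(clone_dna).rstrip("A")  # drop poly(A) tail
    predicted_core = predicted.rstrip("A")
    return {
        "observed": observed,
        "ambiguous_terminal_A": len(predicted) - len(predicted_core),
        "matches_prediction": observed == predicted_core,
    }


# p53 guide strand clone and predicted small RNA from the alignment above.
result = compare_3p_end(
    "TTTTTTTTTTTTCCACTACAAGTACATGTGTAA",
    "AGGUGAUGUUCAUGUACACAUU",
)
print(result)
```

For the p53 guide strand, the clone agrees with the prediction up to its final nucleotide, and the single predicted 3'-terminal A is the residue that A-tailing leaves unresolved.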


Supplementary  Figure  2 

shRNA  (first-generation) 

gugcucgcuucggcagcacauauacuaUUAAUCAGAGACUUCAGGCGUUCAACGAIhmgg 

AUCGUUGACCGCCUGAAGUCUCUGAUUAAUU 


shRNAmir  (second-generation) 

gugcucgcuucggcagcacauauacuagucgacuagggauaacaggguaauuguuugaaugaggcuucaguacuu
uacagaaucguugccugcacaucuuggaaacacuugcugggauuacuucuucagguuaacccaacagaaggcucga
gaagguauauugcuguugacagugagcgccCGCCUGAAGUCUCUGAUUAAUAgugaagccaca
gauguaUUAAUCAGAGACUUCAGGCGGUugccuacugccucggaauucaaggggcuacuuuagg
agcaauuaucuuguuuacuaaaacugaauaccuugcuaucucuuugauacauu


Underline: Leader sequence
CAPITAL: shRNA
CAPITAL BOLD: sense target sequence
Italic: shRNA loop


Supplementary Figure 2. The complete insert sequences for pSM1 and pSM2 containing
luciferase shRNA are shown along with their most stable potential secondary structures
as predicted by RNA fold.

Supplementary Figure 3. Stable suppression by pSM2. HCT116 cells were infected with
pSM2_hsP53_2120 (v2HS_93615) and selected as a population for resistance to puromycin.
To induce p53, populations were treated with 50 µM etoposide for 24 h prior to lysis for
Western blotting. For comparison, HCT116-p53null cells were also examined. Lysates were
examined for levels of p53, a p53 target (p21), and β-actin as a control. Titers were
approximately 1 × 10^6/ml.


